Add scan file body endpoint #79
looks sensible overall
### `POST /_matrix/media_proxy/unstable/scan_file`

Performs a scan on a file body without uploading to Matrix. This request takes a multi-part / form data
just thinking this might be more easily recognisable as a literal content encoding name:
Performs a scan on a file body without uploading to Matrix. This request takes a multi-part / form data
Performs a scan on a file body without uploading to Matrix. This request takes a `multipart/form-data`
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Content-Disposition#as_a_header_for_a_multipart_body is probably the best link I can find on MDN that describes how this is encoded. A bit weak but might be better than nothing
| `body` | [Blob](https://developer.mozilla.org/en-US/docs/Web/API/Blob) | The file body. |
| `file` | EncryptedFile as JSON string | The metadata (decryption key) of an encrypted file. Follows the format of the `EncryptedFile` structure from the [Matrix specification](https://spec.matrix.org/v1.2/client-server-api/#extensions-to-mroommessage-msgtypes). Only required if the file is encrypted. |
unimportant: I kinda find these param names a smidge confusing; why is the file not the file? ;p
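For anyone trying the endpoint by hand, here is a rough sketch of what a `multipart/form-data` request with the two documented fields (`body` and `file`) could look like on the wire. The helper name `build_scan_file_request`, the `filename="upload"` value, and the order of the parts are assumptions for illustration; nothing in this PR confirms which order the server expects.

```python
import json
import uuid
from typing import Optional, Tuple


def build_scan_file_request(
    file_bytes: bytes, encrypted_file: Optional[dict] = None
) -> Tuple[dict, bytes]:
    """Return (headers, body) for a multipart/form-data scan_file request."""
    boundary = uuid.uuid4().hex
    parts = []

    if encrypted_file is not None:
        # `file`: the EncryptedFile metadata, serialised as a JSON string.
        # Sending it first is an assumption, not something the PR specifies.
        parts.append(
            (
                f"--{boundary}\r\n"
                'Content-Disposition: form-data; name="file"\r\n'
                "\r\n"
                f"{json.dumps(encrypted_file)}\r\n"
            ).encode("utf-8")
        )

    # `body`: the raw (possibly encrypted) file content.
    parts.append(
        (
            f"--{boundary}\r\n"
            'Content-Disposition: form-data; name="body"; filename="upload"\r\n'
            "Content-Type: application/octet-stream\r\n"
            "\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )

    # Closing delimiter that terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return headers, b"".join(parts)
```

The returned headers and bytes can be handed to any HTTP client as a `POST` to `/_matrix/media_proxy/unstable/scan_file`.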
Performs a scan on a file body without uploading to Matrix. This request takes a multi-part / form data
body.

Response format:
this looks like the request format; meanwhile we don't specify the response format except through example (which is obvious enough that it's probably fine, don't get me wrong)
removal_command_parts = self._removal_command.split()
removal_command_parts.append(file_path)
subprocess.run(removal_command_parts)
would be quite tempted to pull this out into its own function, to make it easier to modify the removal logic in future
(and if we're being pedantic, really this should be non-blocking/async, but I appreciate this is probably what was already here)
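A minimal sketch of what that could look like, assuming the removal command is a single configured string as in the snippet above. The function name `run_removal_command` is hypothetical, and it swaps `str.split()` for `shlex.split()` so that quoted arguments survive, which is a behaviour change from the code under review.

```python
import asyncio
import shlex
from typing import Optional


async def run_removal_command(
    removal_command: Optional[str], file_path: str
) -> Optional[int]:
    """Run the configured removal command on file_path without blocking the event loop.

    Returns the command's exit code, or None if no command is configured.
    """
    if not removal_command:
        return None

    # shlex.split (rather than str.split) keeps quoted arguments together.
    command_parts = shlex.split(removal_command)
    command_parts.append(file_path)

    # create_subprocess_exec does not go through a shell, so file_path is
    # passed as a single argument and is never interpreted by one.
    process = await asyncio.create_subprocess_exec(*command_parts)
    return await process.wait()
```

Keeping this in one place means a later change (logging the exit code, adding a timeout) only touches this function.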
validate_encrypted_file_metadata(metadata)

# URL parameter is ignored.
not really sure what this comment means
with open(full_path, "wb") as fp:
    while True:
        chunk = await multipart.read_chunk()
        if not chunk:
            break
        fp.write(chunk)
this would probably be better if it was async, but seems that we might have to pull in a library for this. I also guess you aren't the first to do it this way.
If you were interested though, https://pypi.org/project/aiofile/ looks reasonable
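If pulling in a new dependency is unwelcome, a standard-library alternative is to push each blocking write onto a worker thread with `asyncio.to_thread` (Python 3.9+). This is only a sketch of that substitute technique, not `aiofile`; `write_chunks` is a hypothetical name, and `read_chunk` stands in for `multipart.read_chunk` from the diff.

```python
import asyncio
from typing import Awaitable, Callable


async def write_chunks(
    full_path: str, read_chunk: Callable[[], Awaitable[bytes]]
) -> int:
    """Stream chunks to full_path, doing each blocking file call in a worker thread."""
    written = 0
    fp = await asyncio.to_thread(open, full_path, "wb")
    try:
        while True:
            chunk = await read_chunk()
            if not chunk:
                break
            # fp.write blocks on disk I/O, so hand it to the default thread pool.
            written += await asyncio.to_thread(fp.write, chunk)
    finally:
        await asyncio.to_thread(fp.close)
    return written
```

The trade-off is one thread hop per chunk, which is likely acceptable for reasonably large chunks but worth measuring before relying on it.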
if not chunk:
    break
fp.write(chunk)
if this fails to write the file, should it delete the partial file?
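One way to handle that, sketched under the assumption that the write loop lives in its own helper: catch any failure, delete whatever was written, and re-raise. `save_upload` is a hypothetical name and `read_chunk` again stands in for `multipart.read_chunk`.

```python
import contextlib
import os
from typing import Awaitable, Callable


async def save_upload(
    full_path: str, read_chunk: Callable[[], Awaitable[bytes]]
) -> None:
    """Write the upload to full_path, removing the partial file if anything fails."""
    try:
        with open(full_path, "wb") as fp:
            while True:
                chunk = await read_chunk()
                if not chunk:
                    break
                fp.write(chunk)
    except BaseException:
        # Covers read errors, disk errors and task cancellation alike.
        # The `with` block has already closed the file by this point.
        with contextlib.suppress(FileNotFoundError):
            os.remove(full_path)
        raise
```

Catching `BaseException` is deliberate here so that a cancelled request (which raises `asyncio.CancelledError`) also cleans up after itself.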
if metadata is not None:
    with open(file_path, "rb") as f:
        content = f.read()
hmm.. is there any point in writing the encrypted file to disk, if we're just going to read it all back in to memory again?
If we can have clients send the encryption metadata as the first body part, that would mean we could do the right thing a little easier.
Although maybe it would be better if we could decrypt files without having them all in memory
That said: if this is all stuff that was a problem before you, it's also fine to leave it
    str(uuid.uuid4()), media_content
)

# Remove source file now we've decrypted it.
should this also be done in a `finally` block if the file fails to decrypt?
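Roughly what that would look like, as a sketch only: `decrypt_and_remove_source` and the `decrypt` callable are hypothetical stand-ins for the real decryption path in this PR.

```python
import os
from typing import Callable


def decrypt_and_remove_source(
    file_path: str, decrypt: Callable[[bytes], bytes]
) -> bytes:
    """Decrypt the file at file_path, deleting the source whether or not decryption succeeds."""
    try:
        with open(file_path, "rb") as f:
            content = f.read()
        return decrypt(content)
    finally:
        # Runs on success and on failure, so no encrypted source file is left behind.
        if os.path.exists(file_path):
            os.remove(file_path)
```

With this shape the "Remove source file" step no longer depends on decryption having succeeded.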
Fixes #78
This adds a new endpoint, `POST /_matrix/media_proxy/unstable/scan_file`, which takes a multipart body containing a file and a JSON object for decrypting it.